SEQUENCE LISTING 



(1) GENERAL INFORMATION: 

(i) APPLICANT: Palese, Peter 

O'Neill, Robert 

(ii) TITLE OF INVENTION: IDENTIFICATION AND USE OF ANTIVIRAL 

COMPOUNDS THAT INHIBIT INTERACTION OF HOST CELL PROTEINS 
AND VIRAL PROTEINS REQUIRED FOR VIRAL REPLICATION 



(iii) NUMBER OF SEQUENCES: 20 

(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: Pennie & Edmonds 

(B) STREET: 1155 Avenue of the Americas 

(C) CITY: New York 

(D) STATE: New York 

(E) COUNTRY: USA 

(F) ZIP: 10036-2711 

(v) COMPUTER READABLE FORM:

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS/MS-DOS

(D) SOFTWARE: PatentIn Release #1.0, Version #1.30

(vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: US 08/444,994 

(B) FILING DATE: 19-MAY-1995 

(C) CLASSIFICATION:

(viii) ATTORNEY/AGENT INFORMATION:

(A) NAME: Coruzzi, Laura A. 

(B) REGISTRATION NUMBER: 30,742 

(C) REFERENCE/DOCKET NUMBER: 6923-054

(ix) TELECOMMUNICATION INFORMATION: 

(A) TELEPHONE: (212) 790-9090 

(B) TELEFAX: (212) 869-9741/8864 

(C) TELEX: 66141 PENNIE 



(2) INFORMATION FOR SEQ ID NO: 1:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:
GCAAAGCAGG AGAAACCAC 
(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 
GGGTCCATCT GATAGATATG AGAG 
(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 48 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 



(ix) FEATURE: 

(A) NAME/KEY: modified_base

(B) LOCATION: 36 

(D) OTHER INFORMATION: /mod_base= i 

(ix) FEATURE: 

(A) NAME/KEY: modified_base

(B) LOCATION: 37 

(D) OTHER INFORMATION: /mod_base= i 

(ix) FEATURE:

(A) NAME/KEY: modified_base

(B) LOCATION: 41 

(D) OTHER INFORMATION: /mod_base= i 

(ix) FEATURE:

(A) NAME/KEY: modified_base

(B) LOCATION: 42 

(D) OTHER INFORMATION: /mod_base= i 

(ix) FEATURE:

(A) NAME/KEY: modified_base

(B) LOCATION: 46 

(D) OTHER INFORMATION: /mod_base= i 

(ix) FEATURE: 

(A) NAME/KEY: modified_base

(B) LOCATION: 47 

(D) OTHER INFORMATION: /mod_base= i 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 
CUACUACUAC UAGGCCACGC GTCGACTACT ACGGGNNGGG NNGGGNNG 
(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:
TCCTGATGTT GCTGTAGACG 
(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 
GCACGACTAG TATGATTTGC 
(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 8 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 

Thr Gly Ala Gly Ala Gly Leu Gly 
1 5 

(2) INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS:

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:

Tyr Ser Ala Ala Lys 
1 5 

(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: DNA



(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..27 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:8: 

GAC TGG CTG GAA TTC CCC ATG GCG TCC 
Asp Trp Leu Glu Phe Pro Met Ala Ser 
1 5

(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 9 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 

Asp Trp Leu Glu Phe Pro Met Ala Ser 
1 5 

(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2940 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: unknown 

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: cDNA

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 47..1663

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 

CTAACTTCAG CGGTGGCACC GGGATCGGTT GCCTTGAGCC TGAAAT ATG ACC ACC 

Met Thr Thr
1

CCA GGA AAA GAG AAC TTT CGC CTG AAA AGT TAC AAG AAC AAA TCT CTG 
Pro Gly Lys Glu Asn Phe Arg Leu Lys Ser Tyr Lys Asn Lys Ser Leu 
5 10 15 

AAT CCC GAT GAG ATG CGC AGG AGG AGG GAG GAA GAA GGA CTG CAG TTA 
Asn Pro Asp Glu Met Arg Arg Arg Arg Glu Glu Glu Gly Leu Gln Leu
20 25 30 35 

CGA AAG CAG AAA AGA GAA GAG CAG TTA TTC AAG CGG AGA AAT GTT GCT 
Arg Lys Gln Lys Arg Glu Glu Gln Leu Phe Lys Arg Arg Asn Val Ala
40 45 50

ACA GCA GAA GAA GAA ACA GAA GAA GAA GTT ATG TCA GAT GGA GGC TTT 
Thr Ala Glu Glu Glu Thr Glu Glu Glu Val Met Ser Asp Gly Gly Phe
55 60 65

CAT GAG GCT CAG ATT AGT AAC ATG GAG ATG GCA CCA GGT GGT GTC ATC
His Glu Ala Gln Ile Ser Asn Met Glu Met Ala Pro Gly Gly Val Ile
70 75 80

ACT TCT GAC ATG ATT GAG ATG ATA TTT TCC AAA AGC CCA GAG CAA CAG 
Thr Ser Asp Met Ile Glu Met Ile Phe Ser Lys Ser Pro Glu Gln Gln
85 90 95 

CTT TCA GCA ACA CAG AAA TTC AGG AAG CTG CTT TCA AAA GAA CCT AAC 
Leu Ser Ala Thr Gln Lys Phe Arg Lys Leu Leu Ser Lys Glu Pro Asn
100 105 110 115

CCT CCT ATT GAT GAA GTT ATC AGC ACA CCA GGA GTA GTG GCC AGG TTT 
Pro Pro Ile Asp Glu Val Ile Ser Thr Pro Gly Val Val Ala Arg Phe
120 125 130 

GTG GAG TTC CTC AAA CGA AAA GAG AAT TGT TCA CTG CAG TTT GAA TCA 
Val Glu Phe Leu Lys Arg Lys Glu Asn Cys Ser Leu Gln Phe Glu Ser
135 140 145 

GCT TGG GTA CTG ACA AAT ATT GCT TCA GGA AAT TCT CTT CAG ACC CGA 
Ala Trp Val Leu Thr Asn Ile Ala Ser Gly Asn Ser Leu Gln Thr Arg
150 155 160 

ATT GTG ATT CAG GCA AGA GCT GTG CCC ATC TTC ATA GAG TTG CTC AGC 
Ile Val Ile Gln Ala Arg Ala Val Pro Ile Phe Ile Glu Leu Leu Ser
165 170 175 

TCA GAG TTT GAA GAT GTC CAG GAA CAG GCA GTC TGG GCT CTT GGC AAC 
Ser Glu Phe Glu Asp Val Gln Glu Gln Ala Val Trp Ala Leu Gly Asn
180 185 190 195 

ATT GCT GGA GAT AGT ACC ATG TGC AGG GAC TAT GTC TTA GAC TGC AAT 
Ile Ala Gly Asp Ser Thr Met Cys Arg Asp Tyr Val Leu Asp Cys Asn
200 205 210 

ATC CTT CCC CCT CTT TTG CAG TTA TTT TCA AAG CAA AAC CGC CTG ACC 
Ile Leu Pro Pro Leu Leu Gln Leu Phe Ser Lys Gln Asn Arg Leu Thr
215 220 225 

ATG ACC CGG AAT GCA GTA TGG GCT TTG TCT AAT CTC TGT AGA GGG AAA 
Met Thr Arg Asn Ala Val Trp Ala Leu Ser Asn Leu Cys Arg Gly Lys 
230 235 240 

AGT CCA CCT CCA GAA TTT GCA AAG GTT TCT CCA TGT CTG AAT GTG CTT 
Ser Pro Pro Pro Glu Phe Ala Lys Val Ser Pro Cys Leu Asn Val Leu 
245 250 255 

TCC TGG TTG CTG TTT GTC AGT GAC ACT GAT GTA CTG GCT GAT GCC TGC 
Ser Trp Leu Leu Phe Val Ser Asp Thr Asp Val Leu Ala Asp Ala Cys 
260 265 270 275 

TGG GCC CTC TCA TAT CTA TCA GAT GGA CCC AAT GAT AAA ATT CAA GCG 
Trp Ala Leu Ser Tyr Leu Ser Asp Gly Pro Asn Asp Lys Ile Gln Ala
280 285 290 

GTC ATC GAT GCG GGA GTA TGT AGG AGA CTT GTG GAA CTG CTG ATG CAT 
Val Ile Asp Ala Gly Val Cys Arg Arg Leu Val Glu Leu Leu Met His
295 300 305

AAT GAT TAT AAA GTG GTT TCT CCT GCT TTG CGA GCT GTG GGA AAC ATT 
Asn Asp Tyr Lys Val Val Ser Pro Ala Leu Arg Ala Val Gly Asn Ile
310 315 320 

GTC ACA GGG GAT GAT ATT CAG ACA CAG GTA ATT CTG AAT TGC TCA GCT 
Val Thr Gly Asp Asp Ile Gln Thr Gln Val Ile Leu Asn Cys Ser Ala
325 330 335

CTG CAG AGT TTA TTG CAT TTG CTG AGT AGC CCA AAG GAA TCT ATC AAA 1111
Leu Gln Ser Leu Leu His Leu Leu Ser Ser Pro Lys Glu Ser Ile Lys
340 345 350 355

AAG GAA GCA TGT TGG ACG ATA TCT AAT ATT ACA GCT GGA AAT AGG GCA 1159
Lys Glu Ala Cys Trp Thr Ile Ser Asn Ile Thr Ala Gly Asn Arg Ala
360 365 370

CAG ATC CAG ACT GTG ATA GAT GCC AAC ATT TTC CCA GCC CTC ATT AGT 1207
Gln Ile Gln Thr Val Ile Asp Ala Asn Ile Phe Pro Ala Leu Ile Ser
375 380 385

ATT TTA CAA ACT GCT GAA TTT CGG ACA AGA AAA GAA GCA GCT TGG GCC 1255
Ile Leu Gln Thr Ala Glu Phe Arg Thr Arg Lys Glu Ala Ala Trp Ala
390 395 400

ATC ACA AAT GCA ACT TCT GGA GGA TCA GCT GAA CAG ATC AAG TAC CTA 1303
Ile Thr Asn Ala Thr Ser Gly Gly Ser Ala Glu Gln Ile Lys Tyr Leu
405 410 415

GTA GAA CTG GGT TGT ATC AAG CCG CTC TGT GAT CTC CTC ACG GTC ATG 1351
Val Glu Leu Gly Cys Ile Lys Pro Leu Cys Asp Leu Leu Thr Val Met
420 425 430 435

GAC TCT AAG ATT GTA CAG GTT GCC CTA AAT GGC TTG GAA AAT ATC CTG 1399
Asp Ser Lys Ile Val Gln Val Ala Leu Asn Gly Leu Glu Asn Ile Leu
440 445 450

AGG CTT GGA GAA CAG GAA GCC AAA AGG AAC GGC ACT GGC ATT AAC CCT 1447
Arg Leu Gly Glu Gln Glu Ala Lys Arg Asn Gly Thr Gly Ile Asn Pro
455 460 465

TAC TGT GCT TTG ATT GAA GAA GCT TAT GGT CTG GAT AAA ATT GAG TTC 1495
Tyr Cys Ala Leu Ile Glu Glu Ala Tyr Gly Leu Asp Lys Ile Glu Phe
470 475 480

TTA CAG AGT CAT GAA AAC CAG GAG ATC TAC CAA AAG GCC TTT GAT CTT 1543
Leu Gln Ser His Glu Asn Gln Glu Ile Tyr Gln Lys Ala Phe Asp Leu
485 490 495

ATT GAG CAT TAC TTC GGG ACC GAA GAT GAA GAC AGC AGC ATT GCA CCC 1591
Ile Glu His Tyr Phe Gly Thr Glu Asp Glu Asp Ser Ser Ile Ala Pro
500 505 510 515

CAG GTT GAC CTT AAC CAG CAG CAG TAC ATC TTC CAA CAG TGT GAG GCT 1639
Gln Val Asp Leu Asn Gln Gln Gln Tyr Ile Phe Gln Gln Cys Glu Ala
520 525 530

CCT ATG GAA GGT TTC CAG CTT TGA AGCAATACTC TGCTTTCACG TACCTGTGCT 1693
Pro Met Glu Gly Phe Gln Leu *
535

CAGACCAGGC TACCCAGTCG AGTCCTCTTG TGGAGCCCAC AGTCCTCATG GAGCTAACTT 1753
CTCAAATGTT TTCCATAATA CTGTTTGCGC TCATTTGCTT GCCTTGCGCA CCTGCTCTCT 1813
TACACACATC TGGAAAACCT CCGGCTCTCT GTGGTGGGAT ACCCTTCTAA TAAAAGGGTA 1873
ACCAGAACGG CCCACTCTCT TTTACGGAAA AATCCCTAGG CTTTGGAGAT CCGCACTTAC 1933
ATTAGAGTTA TGGGAATATA CACATATTAA TGTGGCTCCC TTTTTCTTGT GGGGGAATAA 1993
AAGAGGACTC CTCCTCATTC CCTTTAACAT GGGGGAAAAA ACTGACATTA AAAGATGAGA 2053
CTAAATCTTT ATCTTGAATT TTACACAACT ACTTACGACA AGGGAGATGT TTAGACCTGT 2113
TGGTATACTT CAGAGTACTT TTCATGAGTT CTTCCACAGT GAACCCTTGG ATTACCTGGT 2173

GGCTTTTTCT AGCCAGATTG CATTAATCCT TACTGAGATT GGATGGTTTT CTTTCCTCTA 2233

TTGGCGCCAT TCTTCAGATA TTAAAGTTAA ACCATCCACT CCCTCACCTT CAGCCTTCAG 2293 

TGAATGTGCT TTCTAGTTGT CAGGAATGCT GAAGAATTAA CACTTTGACT CCTAAATGTG 2353 

ATACTGGTGG GTAAGAGCAG GGCACATTTA ATTTGTTCGC TTTTGCTTCT CTTTGGTCTG 2413 

GGCACATTTA ATTTGTTCGC TTTTGCTTCT CTTTGGTCTT TTCGAATACT TAGTAATCGA 2473 

AAACCATATC CTGTAATTTA ATAAAAAAAA CTAAGGACGA AAAAACCCCT CCAATTTTCC 2533 

CAAATGCAAT CAGTGTAACT AGGGGCTGTG TTTCTGCATT AAAATAAATG TTTCAGGCTT 2593 

TGTGGTCCTG ATCAAGGTCC TCATTAAAAA ATTGGAGTTC ACCCTAGGCT TTTCCCCTCT 2653 

GTGACTGGCA GATAACACAT ACTTTTGAAA GTAACTTTGG GATTTTTTTT CTTAGGTGCA 2713 

GCTCGATTCT AATCTTTTCA TGCTGCACAC GATTCCTTTA ATCGATAGCA TCCTTATCTG 2773 

AAAGAAATAA CCATCTTCTC AACATGACCT GCTTAACCCA AATAAGAACA GTGATCTTAT 2833 

AACCTCATTG TTTCCTAATC TATTTTATTT CATCTCCTGC TAGTACTGTG CCGCTTCCCC 2893 

CTCCCCCCAC ACAAAATAAA AACAGTATCT CGCTTCTGGC TCATTTT 2940

(2) INFORMATION FOR SEQ ID NO: 11:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 539 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 

Met Thr Thr Pro Gly Lys Glu Asn Phe Arg Leu Lys Ser Tyr Lys Asn
1 5 10 15

Lys Ser Leu Asn Pro Asp Glu Met Arg Arg Arg Arg Glu Glu Glu Gly
20 25 30

Leu Gln Leu Arg Lys Gln Lys Arg Glu Glu Gln Leu Phe Lys Arg Arg
35 40 45

Asn Val Ala Thr Ala Glu Glu Glu Thr Glu Glu Glu Val Met Ser Asp
50 55 60

Gly Gly Phe His Glu Ala Gln Ile Ser Asn Met Glu Met Ala Pro Gly
65 70 75 80

Gly Val Ile Thr Ser Asp Met Ile Glu Met Ile Phe Ser Lys Ser Pro
85 90 95

Glu Gln Gln Leu Ser Ala Thr Gln Lys Phe Arg Lys Leu Leu Ser Lys
100 105 110

Glu Pro Asn Pro Pro Ile Asp Glu Val Ile Ser Thr Pro Gly Val Val
115 120 125

Ala Arg Phe Val Glu Phe Leu Lys Arg Lys Glu Asn Cys Ser Leu Gln
130 135 140

Phe Glu Ser Ala Trp Val Leu Thr Asn Ile Ala Ser Gly Asn Ser Leu
145 150 155 160

Gln Thr Arg Ile Val Ile Gln Ala Arg Ala Val Pro Ile Phe Ile Glu
165 170 175

Leu Leu Ser Ser Glu Phe Glu Asp Val Gln Glu Gln Ala Val Trp Ala
180 185 190

Leu Gly Asn Ile Ala Gly Asp Ser Thr Met Cys Arg Asp Tyr Val Leu
195 200 205

Asp Cys Asn Ile Leu Pro Pro Leu Leu Gln Leu Phe Ser Lys Gln Asn
210 215 220

Arg Leu Thr Met Thr Arg Asn Ala Val Trp Ala Leu Ser Asn Leu Cys
225 230 235 240

Arg Gly Lys Ser Pro Pro Pro Glu Phe Ala Lys Val Ser Pro Cys Leu
245 250 255

Asn Val Leu Ser Trp Leu Leu Phe Val Ser Asp Thr Asp Val Leu Ala
260 265 270

Asp Ala Cys Trp Ala Leu Ser Tyr Leu Ser Asp Gly Pro Asn Asp Lys
275 280 285

Ile Gln Ala Val Ile Asp Ala Gly Val Cys Arg Arg Leu Val Glu Leu
290 295 300

Leu Met His Asn Asp Tyr Lys Val Val Ser Pro Ala Leu Arg Ala Val
305 310 315 320

Gly Asn Ile Val Thr Gly Asp Asp Ile Gln Thr Gln Val Ile Leu Asn
325 330 335

Cys Ser Ala Leu Gln Ser Leu Leu His Leu Leu Ser Ser Pro Lys Glu
340 345 350

Ser Ile Lys Lys Glu Ala Cys Trp Thr Ile Ser Asn Ile Thr Ala Gly
355 360 365

Asn Arg Ala Gln Ile Gln Thr Val Ile Asp Ala Asn Ile Phe Pro Ala
370 375 380

Leu Ile Ser Ile Leu Gln Thr Ala Glu Phe Arg Thr Arg Lys Glu Ala
385 390 395 400

Ala Trp Ala Ile Thr Asn Ala Thr Ser Gly Gly Ser Ala Glu Gln Ile
405 410 415

Lys Tyr Leu Val Glu Leu Gly Cys Ile Lys Pro Leu Cys Asp Leu Leu
420 425 430

Thr Val Met Asp Ser Lys Ile Val Gln Val Ala Leu Asn Gly Leu Glu
435 440 445

Asn Ile Leu Arg Leu Gly Glu Gln Glu Ala Lys Arg Asn Gly Thr Gly
450 455 460

Ile Asn Pro Tyr Cys Ala Leu Ile Glu Glu Ala Tyr Gly Leu Asp Lys
465 470 475 480

Ile Glu Phe Leu Gln Ser His Glu Asn Gln Glu Ile Tyr Gln Lys Ala
485 490 495

Phe Asp Leu Ile Glu His Tyr Phe Gly Thr Glu Asp Glu Asp Ser Ser
500 505 510

Ile Ala Pro Gln Val Asp Leu Asn Gln Gln Gln Tyr Ile Phe Gln Gln
515 520 525

Cys Glu Ala Pro Met Glu Gly Phe Gln Leu *
530 535

(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 542 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 

Met Asp Asn Gly Thr Asp Ser Ser Thr Ser Lys Phe Val Pro Glu Tyr
1 5 10 15

Arg Arg Thr Asn Phe Lys Asn Lys Gly Arg Phe Ser Ala Asp Glu Leu
20 25 30

Arg Arg Arg Arg Asp Thr Gln Gln Val Glu Leu Arg Lys Ala Lys Arg
35 40 45

Asp Glu Ala Leu Ala Lys Arg Arg Asn Phe Ile Pro Pro Thr Asp Gly
50 55 60

Ala Asp Ser Asp Glu Glu Asp Glu Ser Ser Val Ser Ala Asp Gln Gln
65 70 75 80

Phe Tyr Ser Gln Leu Gln Gln Glu Leu Pro Gln Met Thr Gln Gln Leu
85 90 95

Asn Ser Asp Asp Met Gln Glu Gln Leu Ser Ala Thr Val Lys Phe Arg
100 105 110

Gln Ile Leu Ser Arg Glu His Arg Pro Pro Ile Asp Val Val Ile Gln
115 120 125

Ala Gly Val Val Pro Arg Leu Val Glu Phe Met Arg Glu Asn Gln Pro
130 135 140

Glu Met Leu Gln Leu Glu Ala Ala Trp Ala Leu Thr Asn Ile Ala Ser
145 150 155 160

Gly Thr Ser Ala Gln Thr Lys Val Val Val Asp Ala Asp Ala Val Pro
165 170 175

Leu Phe Ile Gln Leu Leu Tyr Thr Gly Ser Val Glu Val Lys Glu Gln
180 185 190

Ala Ile Trp Ala Leu Gly Asn Val Ala Gly Asp Ser Thr Asp Tyr Arg
195 200 205

Asp Tyr Val Leu Gln Cys Asn Ala Met Glu Pro Ile Leu Gly Leu Phe
210 215 220

Asn Ser Asn Lys Pro Ser Leu Ile Arg Thr Ala Thr Trp Thr Leu Ser
225 230 235 240

Asn Leu Cys Arg Gly Lys Lys Pro Gln Pro Asp Trp Ser Val Val Ser
245 250 255

Gln Ala Leu Pro Thr Leu Ala Lys Leu Ile Tyr Ser Met Asp Thr Glu
260 265 270

Thr Leu Val Asp Ala Cys Trp Ala Ile Ser Tyr Leu Ser Asp Gly Pro
275 280 285

Gln Glu Ala Ile Gln Ala Val Ile Asp Val Arg Ile Pro Lys Arg Leu
290 295 300

Val Glu Leu Leu Ser His Glu Ser Thr Leu Val Gln Thr Pro Ala Leu
305 310 315 320

Arg Ala Val Gly Asn Ile Val Thr Gly Asn Asp Leu Gln Thr Gln Val
325 330 335

Val Ile Asn Ala Gly Val Leu Pro Ala Leu Arg Leu Leu Leu Ser Ser
340 345 350

Pro Lys Glu Asn Ile Lys Lys Glu Ala Cys Trp Thr Ile Ser Asn Ile
355 360 365

Thr Ala Gly Asn Thr Glu Gln Ile Gln Ala Val Ile Asp Ala Asn Leu
370 375 380

Ile Pro Pro Leu Val Lys Leu Leu Glu Val Ala Glu Tyr Lys Thr Lys
385 390 395 400

Lys Glu Ala Cys Trp Ala Ile Ser Asn Ala Ser Ser Gly Gly Leu Gln
405 410 415

Arg Pro Asp Ile Ile Arg Tyr Leu Val Ser Gln Gly Cys Ile Lys Pro
420 425 430

Leu Cys Asp Leu Leu Glu Ile Ala Asp Asn Arg Ile Ile Glu Val Thr
435 440 445

Leu Asp Ala Leu Glu Asn Ile Leu Lys Met Gly Glu Ala Asp Lys Glu
450 455 460

Ala Arg Gly Leu Asn Ile Asn Glu Asn Ala Asp Phe Ile Glu Lys Ala
465 470 475 480

Gly Gly Met Glu Lys Ile Phe Asn Cys Gln Gln Asn Glu Asn Asp Lys
485 490 495

Ile Tyr Glu Lys Ala Tyr Lys Ile Ile Glu Thr Tyr Phe Gly Glu Glu
500 505 510

Glu Asp Ala Val Asp Glu Thr Met Ala Pro Gln Asn Ala Gly Asn Thr
515 520 525

Phe Gly Phe Gly Ser Asn Val Asn Gln Gln Phe Asn Phe Asn
530 535 540

(2) INFORMATION FOR SEQ ID NO: 13:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 170 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:
GGAGGCACCG AAGGGCAGCG CCGAGTCGGA GGGGGCGAAG ATTGACGCCA GTAAGAACGA 
GGAGGATGAA GGCCATTCAA ACTCCTCCCC ACGACACTCT GAAGCAGCGA CGGCACAGCG 
GGAAGAATGG AAAATGTTTA TAGGAGGCCT TAGCTGGGAC ACTACAAAGA 
(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1827 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: DNA 



(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 1..1362 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 

GAG GTC AAT GTG GAG CTG AGG AAA GCT AAG AAG GAT GAC CAG ATG CTG 
Glu Val Asn Val Glu Leu Arg Lys Ala Lys Lys Asp Asp Gln Met Leu
1 5 10 15 

AAG AGG AGA AAT GTA AGC TCA TTT CCT GAT GAT GCT ACT TCT CCG CTG 
Lys Arg Arg Asn Val Ser Ser Phe Pro Asp Asp Ala Thr Ser Pro Leu 
20 25 30 

CAG GAA AAC CGC AAC AAC CAG GGC ACT GTA AAT TGG TCT GTT GAT GAC 
Gln Glu Asn Arg Asn Asn Gln Gly Thr Val Asn Trp Ser Val Asp Asp
35 40 45 

ATT GTC AAA GGC ATA AAT AGC AGC AAT GTG GAA AAT CAG CTC CAA GCT 
Ile Val Lys Gly Ile Asn Ser Ser Asn Val Glu Asn Gln Leu Gln Ala
50 55 60 

ACT CAA GCT GCC AGG AAA CTA CTT TCC AGA GAA AAA CAG CCC CCC ATA 
Thr Gln Ala Ala Arg Lys Leu Leu Ser Arg Glu Lys Gln Pro Pro Ile
65 70 75 80 

GAC AAC ATA ATC CGG GCT GGT TTG ATT CCG AAA TTT GTG TCC TTC TTG 
Asp Asn Ile Ile Arg Ala Gly Leu Ile Pro Lys Phe Val Ser Phe Leu
85 90 95 

GGC AGA ACT GAT TGT AGT CCC ATT CAG TTT GAA TCT GCT TGG GCA CTC 
Gly Arg Thr Asp Cys Ser Pro Ile Gln Phe Glu Ser Ala Trp Ala Leu
100 105 110 

ACT AAC ATT GCT TCT GGG ACA TCA GAA CAA ACC AAG GCT GTG GTA GAT 
Thr Asn Ile Ala Ser Gly Thr Ser Glu Gln Thr Lys Ala Val Val Asp
115 120 125 

GGA GGT GCC ATC CCA GCA TTC ATT TCT CTG TTG GCA TCT CCC CAT GCT 
Gly Gly Ala Ile Pro Ala Phe Ile Ser Leu Leu Ala Ser Pro His Ala
130 135 140 

CAC ATC AGT GAA CAA GCT GTC TGG GCT CTA GGA AAC ATT GCA GGT GAT 
His Ile Ser Glu Gln Ala Val Trp Ala Leu Gly Asn Ile Ala Gly Asp
145 150 155 160 

GGC TCA GTG TTC CGA GAC TTG GTT ATT AAG TAC GGT GCA GTT GAC CCA
Gly Ser Val Phe Arg Asp Leu Val Ile Lys Tyr Gly Ala Val Asp Pro
165 170 175

CTG TTG GCT CTC CTT GCA GTT CCT GAT ATG TCA TCT TTA GCA TGT GGC
Leu Leu Ala Leu Leu Ala Val Pro Asp Met Ser Ser Leu Ala Cys Gly
180 185 190

TAC TTA CGT AAT CTT ACC TGG ACA CTT TCT AAT CTT TGC CGC AAC AAG
Tyr Leu Arg Asn Leu Thr Trp Thr Leu Ser Asn Leu Cys Arg Asn Lys
195 200 205

AAT CCT GCA CCC CCG ATA GAT GCT GTT GAG CAG ATT CTT CCT ACC TTA
Asn Pro Ala Pro Pro Ile Asp Ala Val Glu Gln Ile Leu Pro Thr Leu
210 215 220

GTT CGG CTC CTG CAT CAT GAT GAT CCA GAA GTG TTA GCA GAT ACC TGC 720
Val Arg Leu Leu His His Asp Asp Pro Glu Val Leu Ala Asp Thr Cys
225 230 235 240

TGG GCT ATT TCC TAC CTT ACT GAT GGT CCA AAT GAA CGA ATT GGC ATG 768
Trp Ala Ile Ser Tyr Leu Thr Asp Gly Pro Asn Glu Arg Ile Gly Met
245 250 255

GTG GTG AAA ACA GGA GTT GTG CCC CAA CTT GTG AAG CTT CTA GGA GCT 816
Val Val Lys Thr Gly Val Val Pro Gln Leu Val Lys Leu Leu Gly Ala
260 265 270

TCT GAA TTG CCA ATT GTG ACT CCT GCC CTA AGA GCC ATA GGG AAT ATT 864
Ser Glu Leu Pro Ile Val Thr Pro Ala Leu Arg Ala Ile Gly Asn Ile
275 280 285

GTC ACT GGT ACA GAT GAA CAG ACT CAG GTT GTG ATT GAT GCA GGA GCA 912
Val Thr Gly Thr Asp Glu Gln Thr Gln Val Val Ile Asp Ala Gly Ala
290 295 300

CTC GCC GTC TTT CCC AGC CTG CTC ACC AAC CCC AAA ACT AAC ATT CAG 960
Leu Ala Val Phe Pro Ser Leu Leu Thr Asn Pro Lys Thr Asn Ile Gln
305 310 315 320

AAG GAA GCT ACG TGG ACA ATG TCA AAC ATC ACA GCC GGC CGC CAG GAC 1008
Lys Glu Ala Thr Trp Thr Met Ser Asn Ile Thr Ala Gly Arg Gln Asp
325 330 335

CAG ATA CAG CAA GTT GTG AAT CAT GGA TTA GTC CCA TTC CTT GTC AGT 1056
Gln Ile Gln Gln Val Val Asn His Gly Leu Val Pro Phe Leu Val Ser
340 345 350

GTT CTC TCT AAG GCA GAT TTT AAG ACA CAA AAG GAA GCT GTG TGG GCC 1104
Val Leu Ser Lys Ala Asp Phe Lys Thr Gln Lys Glu Ala Val Trp Ala
355 360 365

GTG ACC AAC TAT ACC AGT GGT GGA ACA GTT GAA CAG ATT GTG TAC CTT 1152
Val Thr Asn Tyr Thr Ser Gly Gly Thr Val Glu Gln Ile Val Tyr Leu
370 375 380

GTT CAC TGT GGC ATA ATA GAA CCG TTG ATG AAC CTC TTA ACT GCA AAA 1200
Val His Cys Gly Ile Ile Glu Pro Leu Met Asn Leu Leu Thr Ala Lys
385 390 395 400

GAT ACC AAG ATT ATT CTG GTT ATC CTG GAT GCC ATT TCA AAT ATC TTT 1248
Asp Thr Lys Ile Ile Leu Val Ile Leu Asp Ala Ile Ser Asn Ile Phe
405 410 415

CAG GCT GCT GAG AAA CTA GGT GAA ACT AGC TGC CCG TCT TCA CAG ATT 1296
Gln Ala Ala Glu Lys Leu Gly Glu Thr Ser Cys Pro Ser Ser Gln Ile
420 425 430

CAA GAA CAA GGG AAA AGA CAG TAC AGA AAT GAG GCG TCC GAG GCG TCG
Gln Glu Gln Gly Lys Arg Gln Tyr Arg Asn Glu Ala Ser Glu Ala Ser
435 440 445

CAG AAT AGA GAA ACT TAG TATAATGATT GAAGAATGTG GAGGCTTAGA 
Gln Asn Arg Glu Thr *
450 

CAAAATTGAA GCTCTACAAA ACCATGAAAA TGAGTCTGTG TATAAGGCTT CGTTAAGCTT 
AATTGAGAAG TATTTCTCTG TAGAGGAAGA GGAAGATCAA AACGTTGTAC CAGAAACTAC
CTCTGAAGGC TACACTTTCC AAGTTCAGGA TGGGGCTCCT GGGACCTTTA ACTTTTAGAT 
CATGTAGCTG AGACATAAAT TTGTTGTGTA CTACGTTTGG TATTTTGTCT TATTGTTTCT 
CTACTAAGAA CTCTTTCTTA AATGTGGTTT GTTACTGTAG CACTTTTTAC ACTGAAACTA 
TACTTGAACA GTTCCAACTG TACATACATA CTGTATGAAG CTTGTCCTCT GACTAGGTTT 
CTAATTTCTA TGTGGAATTT CCTATCTTGC AGCATCCTGT AAATAAACAT TCAAGTCCAC 
CCTTTTCTTG ACTTC 

(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 454 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 

Glu Val Asn Val Glu Leu Arg Lys Ala Lys Lys Asp Asp Gln Met Leu
1 5 10 15

Lys Arg Arg Asn Val Ser Ser Phe Pro Asp Asp Ala Thr Ser Pro Leu
20 25 30

Gln Glu Asn Arg Asn Asn Gln Gly Thr Val Asn Trp Ser Val Asp Asp
35 40 45

Ile Val Lys Gly Ile Asn Ser Ser Asn Val Glu Asn Gln Leu Gln Ala
50 55 60

Thr Gln Ala Ala Arg Lys Leu Leu Ser Arg Glu Lys Gln Pro Pro Ile
65 70 75 80

Asp Asn Ile Ile Arg Ala Gly Leu Ile Pro Lys Phe Val Ser Phe Leu
85 90 95

Gly Arg Thr Asp Cys Ser Pro Ile Gln Phe Glu Ser Ala Trp Ala Leu
100 105 110

Thr Asn Ile Ala Ser Gly Thr Ser Glu Gln Thr Lys Ala Val Val Asp
115 120 125

Gly Gly Ala Ile Pro Ala Phe Ile Ser Leu Leu Ala Ser Pro His Ala
130 135 140

His Ile Ser Glu Gln Ala Val Trp Ala Leu Gly Asn Ile Ala Gly Asp
145 150 155 160

Gly Ser Val Phe Arg Asp Leu Val Ile Lys Tyr Gly Ala Val Asp Pro
165 170 175

Leu Leu Ala Leu Leu Ala Val Pro Asp Met Ser Ser Leu Ala Cys Gly
180 185 190

Tyr Leu Arg Asn Leu Thr Trp Thr Leu Ser Asn Leu Cys Arg Asn Lys
195 200 205

Asn Pro Ala Pro Pro Ile Asp Ala Val Glu Gln Ile Leu Pro Thr Leu
210 215 220

Val Arg Leu Leu His His Asp Asp Pro Glu Val Leu Ala Asp Thr Cys
225 230 235 240

Trp Ala Ile Ser Tyr Leu Thr Asp Gly Pro Asn Glu Arg Ile Gly Met
245 250 255

Val Val Lys Thr Gly Val Val Pro Gln Leu Val Lys Leu Leu Gly Ala
260 265 270

Ser Glu Leu Pro Ile Val Thr Pro Ala Leu Arg Ala Ile Gly Asn Ile
275 280 285

Val Thr Gly Thr Asp Glu Gln Thr Gln Val Val Ile Asp Ala Gly Ala
290 295 300

Leu Ala Val Phe Pro Ser Leu Leu Thr Asn Pro Lys Thr Asn Ile Gln
305 310 315 320

Lys Glu Ala Thr Trp Thr Met Ser Asn Ile Thr Ala Gly Arg Gln Asp
325 330 335

Gln Ile Gln Gln Val Val Asn His Gly Leu Val Pro Phe Leu Val Ser
340 345 350

Val Leu Ser Lys Ala Asp Phe Lys Thr Gln Lys Glu Ala Val Trp Ala
355 360 365

Val Thr Asn Tyr Thr Ser Gly Gly Thr Val Glu Gln Ile Val Tyr Leu
370 375 380

Val His Cys Gly Ile Ile Glu Pro Leu Met Asn Leu Leu Thr Ala Lys
385 390 395 400

Asp Thr Lys Ile Ile Leu Val Ile Leu Asp Ala Ile Ser Asn Ile Phe
405 410 415

Gln Ala Ala Glu Lys Leu Gly Glu Thr Ser Cys Pro Ser Ser Gln Ile
420 425 430

Gln Glu Gln Gly Lys Arg Gln Tyr Arg Asn Glu Ala Ser Glu Ala Ser
435 440 445

Gln Asn Arg Glu Thr *
450

(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 259 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:

GAACGACCAA GAGGGTGTTC GACTGCTAGA GCCGAGCAGA AGCGTGCCTA AATCAAAGGA 60 

ACTTGTTTCT TCAAGCTCTT CTGGCAGTGA TTCTGACAGT GAGGTTGACA AAAAGTTAAG 120 

CAGGAAAAAG CAAGTTGCTC CAGAAAAACC TGTAAAGAAA CAAAAGACAG GTGAGACTTC 180 

GAGAGCCCTG TCATCTTCTA AACAGAGCAG CAGCAGCAGA GATGATAACA TGTTTCAGAT 240 

TGGGAAAATG AGGTCAGTT 259



(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 221 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: unknown 

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: DNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 

TGTCGACTGT GGCTTTGAGC ATCCGTCAGA AGTCCAGCAT GAGTGCATCC CTCAGGCCAT 60 

TCTGGGAATG GATGTCCTGT GCCAGGCCAA GTCGGGCATG GGAAAGACAG CAGTGTTTGT 120 

CTTGGCCACA CTGCAACAGC TGGAGCCAGT TACTGGGCAG GTGTCTGTAC TGGTGATGTG 180 

TCACACTCGG GAGTTGGCTT TTCAGATCAG CAAGGAATAT G 221 



(2) INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 372 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: unknown 

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: DNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18: 

ATTTGTAAAC CCCGGAGCGA GGTTCTGCTT ACCCGAGGCC GCTGCTGTGC GGAGACCCCC 60 

GGGTGAAGCC ACCGTCATCA TGTCTGACCA GGAGGCAAAA CCTTCAACTG AGGACTTGGG 120 

GGATAAGAAG GAAGGTGAAT ATATTAAACT CAAAGTCATT GGACAGGATA GCAGTGAGAT 180 

TCACTTCAAA GTGAAAATGA CAACACATCT CAAGAAACTC AAAGAATCAT ACTGTCAAAG 240

ACAGGGTGTT CCAATGAATT CACTCAGGTT TCTCTTTGAG GGTCAGAGAA TTGCTGATAA 300 

TCATACTCCA AAAGAACTGG GAATGGAGGA AGAAGTTGTG ATTGAAGTTT ATCAGGAACA 360 
AACGGGGGGT CA "372 
(2) INFORMATION FOR SEQ ID NO: 19: 
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(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2675 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: unknown 

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: DNA 



(ix) FEATURE:

(A) NAME/KEY: CDS 

(B) LOCATION: 104..2311



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19: 

TCTGACCCTC GTCCCGCCCC CGCCATTCGC CGCCTCCTCC TGTCCCGCAG TCGGCGTCCA 60 

GCGGCTCTGC TTGTTCGTGT GTGTGTCGTT GCAGGCCTTA TTC ATG GGC TCA CCG 115 

Met Gly Ser Pro 
1 

CTG AGG TTC GAC GGG CGG GTG GTA CTG GTC ACC GGC GCG GGG GCA GGA 163 
Leu Arg Phe Asp Gly Arg Val Val Leu Val Thr Gly Ala Gly Ala Gly 
5 10 15 20 

TTG GGC CGA GCC TAT GCC CTG GCT TTT GCA GAA AGA GGA GCG TTA GTT 211 
Leu Gly Arg Ala Tyr Ala Leu Ala Phe Ala Glu Arg Gly Ala Leu Val 
25 30 35 

GTT GTG AAT GAT TTG GGA GGG GAC TTC AAA GGA GTT GGT AAA GGC TCC 259
Val Val Asn Asp Leu Gly Gly Asp Phe Lys Gly Val Gly Lys Gly Ser 
40 45 50 

TTA GCT GAT AAG GTT GTT GAA GAA ATA AGA AGG AGA GGT GGA AAA GCA 307 
Leu Ala Asp Lys Val Val Glu Glu Ile Arg Arg Arg Gly Gly Lys Ala
55 60 65 

GTG GCC AAC TAT GAT TCA GTG GAA GAA GGA GAG AAG GTT GTG AAG ACA 355 
Val Ala Asn Tyr Asp Ser Val Glu Glu Gly Glu Lys Val Val Lys Thr 
70 75 80 

GCC CTG GAT GCT TTT GGA AGA ATA GAT GTT GTG GTC AAC AAT GCT GGA 403 
Ala Leu Asp Ala Phe Gly Arg Ile Asp Val Val Val Asn Asn Ala Gly
85 90 95 100 

ATT CTG AGG GAT CAT TCC TTT GCT AGG ATA AGT GAT GAA GAC TGG GAT 451 
Ile Leu Arg Asp His Ser Phe Ala Arg Ile Ser Asp Glu Asp Trp Asp
105 110 115

ATA ATC CAC AGA GTT CAT TTG CGG GGT TCA TTC CAA GTG ACA CGG GCA 499 
Ile Ile His Arg Val His Leu Arg Gly Ser Phe Gln Val Thr Arg Ala
120 125 130 

GCA TGG GAA CAC ATG AAG AAA CAG AAG TAT GGA AGG ATT ATT ATG ACT 547 
Ala Trp Glu His Met Lys Lys Gln Lys Tyr Gly Arg Ile Ile Met Thr
135 140 145 

TCA TCA GCT TCA GGA ATA TAT GGC AAC TTT GGC CAG GCC AAT TAT AGT 595 
Ser Ser Ala Ser Gly Ile Tyr Gly Asn Phe Gly Gln Ala Asn Tyr Ser
150 155 160 

GCT GCA AAG TTG GGT CTT CTG GGC CTT GCA AAT TCT CTT GCA ATT GAA 643 
Ala Ala Lys Leu Gly Leu Leu Gly Leu Ala Asn Ser Leu Ala Ile Glu
165 170 175 180 

GGC AGG AAA AGC AAC ATT CAT TGT AAC ACC ATT GCT CCT AAT GCG GGA 691 
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Gly Arg Lys Ser Asn lie His Cys Asn Thr lie Ala Pro Asn Ala Gly 

185 190 195 

TCA CGG ATG ACT CAG ACA GTT ATG CCT GAA GAT CTT GTG GAA GCC TTG 739 

Ser Arg Met Thr Gin Thr Val Met Pro Glu Asp Leu Val Glu Ala Leu 
200 205 210 

AAG CCA GAG TAT GTG GCA CCT CTT GTC CTT TGG CTT TGT CAC GAG AGT 787 

Lys Pro Glu Tyr Val Ala Pro Leu Val Leu Trp Leu Cys His Glu Ser 

215 220 225 

TGT GAG GAG AAT GGT GGC TTG TTT GAG GTT GGT GCA GGA TGG ATT GGA 835 

Cys Glu Glu Asn Gly Gly Leu Phe Glu Val Gly Ala Gly Trp lie Gly 
230 235 240 

AAA TTA CGC TGG GAG CGG ACT CTT GGA GCT ATT GTA AGA CAA AAG AAT 883 

Lys Leu Arg Trp Glu Arg Thr Leu Gly Ala lie Val Arg Gin Lys Asn 
245 250 255 260 

CAC CCA ATG ACT CCT GAG GCA GTC AAG GCT AAC TGG AAG AAG ATC TGT 931 

His Pro Met Thr Pro Glu Ala Val Lys Ala Asn Trp Lys Lys lie Cys 

265 270 275 

GAC TTT GAG AAT GCC AGC AAG CCT CAG AGT ATC CAA GAA TCA ACT GGC 979 

Asp Phe Glu Asn Ala Ser Lys Pro Gin Ser lie Gin Glu Ser Thr Gly 
280 285 290 

AGT ATA ATT GAA GTT CTG AGT AAA ATA GAT TCA GAA GGA GGA GTT TCA 1027 

Ser lie lie Glu Val Leu Ser Lys lie Asp Ser Glu Gly Gly Val Ser 

295 300 305 

GCA AAT CAT ACT AGT CGT GCA ACG TCT ACA GCA ACA TCA GGA TTT GCT 107 5 

Ala Asn His Thr Ser Arg Ala Thr Ser Thr Ala Thr Ser Gly Phe Ala 
310 315 320 

GGA GCT ATT GGC CAG AAA CTC CCT CCA TTT TCT TAT GCT TAT ACG GAA 112 3 

Gly Ala lie Gly Gin Lys Leu Pro Pro Phe Ser Tyr Ala Tyr Thr Glu 
325 330 335 340 

CTG GAA GCT ATT ATG TAT GCC CTT GGA GTG GGA GCG TCA ATC AAG GAT 1171 

Leu Glu Ala lie Met Tyr Ala Leu Gly Val Gly Ala Ser lie Lys Asp 

345 350 355 

CCA AAA GAT TTG AAA TTT ATT TAT GAA GGA AGT TCT GAT TTC TCC TGT 1219 

Pro Lys Asp Leu Lys Phe lie Tyr Glu Gly Ser Ser Asp Phe Ser Cys 
360 365 370 

TTG CCC ACC TTC GGA GTT ATC ATA GGT CAG AAA TCT ATG ATG GGT GGA 12 67 

Leu Pro Thr Phe Gly Val lie lie Gly Gin Lys Ser Met Met Gly Gly 

375 380 385 

GGA TTA GCA GAA ATT CCT GGA CTT TCA ATC AAC TTT GCA AAG GTT CTT 1315 

Gly Leu Ala Glu lie Pro Gly Leu Ser lie Asn Phe Ala Lys Val Leu 
390 395 400 

CAT GGA GAG CAG TAC TTA GAG TTA TAT AAA CCA CTT CCC AGA GCA GGA 1363 

His Gly Glu Gin Tyr Leu Glu Leu Tyr Lys Pro Leu Pro Arg Ala Gly 
405 410 ** 415 420 

AAA TTA AAA TGT GAA GCA GTT GTT GCT GAT GTC CTA GAT AAA GGA TCC 1411 

Lys Leu Lys Cys Glu Ala Val Val Ala Asp Val Leu Asp Lys Gly Ser 

425 430 435 

GGT GTA GTG ATT ATT ATG GAT GTC TAT TCT TAT TCT GAG AAG GAA CTT 1459

Gly Val Val Ile Ile Met Asp Val Tyr Ser Tyr Ser Glu Lys Glu Leu
440 445 450 

ATA TGC CAC AAT CAG TTC TCT CTC TTT CTT GTT GGC TCT GGA GGC TTT 1507
Ile Cys His Asn Gln Phe Ser Leu Phe Leu Val Gly Ser Gly Gly Phe
455 460 465 

GGT GGA AAA CGG ACA TCA GAC AAA GTC AAG GTA GCT GTA GCC ATA CCT 1555 
Gly Gly Lys Arg Thr Ser Asp Lys Val Lys Val Ala Val Ala Ile Pro
470 475 480 

AAT AGA CCT CCT GAT GCT GTA CTT ACA GAT ACC ACC TCT CTT AAT CAG 1603
Asn Arg Pro Pro Asp Ala Val Leu Thr Asp Thr Thr Ser Leu Asn Gln
485 490 495 500 

GCT GCT TTG TAC CGC CTC AGT GGA GAC CGG AAT CCC TTA CAC ATT GAT 1651 
Ala Ala Leu Tyr Arg Leu Ser Gly Asp Arg Asn Pro Leu His Ile Asp
505 510 515 

CCT AAC TTT GCT AGT CTA GCA GGT TTT GAC AAG CCC ATA TTA CAT GGA 1699 
Pro Asn Phe Ala Ser Leu Ala Gly Phe Asp Lys Pro Ile Leu His Gly
520 525 530 

TTA TGT ACA TTT GGA TTT TCT GCC AGG CGT GTG TTA CAG CAG TTT GCA 1747 
Leu Cys Thr Phe Gly Phe Ser Ala Arg Arg Val Leu Gln Gln Phe Ala
535 540 545 

GAT AAT GAT GTG TCA AGA TTC AAG GCA GTT AAG GCT CGT TTT GCA AAA 1795 
Asp Asn Asp Val Ser Arg Phe Lys Ala Val Lys Ala Arg Phe Ala Lys 
550 555 560 

CCA GTA TAT CCA GGA CAA ACT CTA CAA ACT GAG ATG TGG AAG GAA GGA 1843 
Pro Val Tyr Pro Gly Gln Thr Leu Gln Thr Glu Met Trp Lys Glu Gly
565 570 575 580

AAC AGA ATT CAT TTT CAA ACC AAG GTC CAA GAA ACT GGA GAC ATT GTC 1891 
Asn Arg Ile His Phe Gln Thr Lys Val Gln Glu Thr Gly Asp Ile Val
585 590 595 

ATT TCA AAT GCA TAT GTG GAT CTT GCA CCA ACA TCT GGT ACT TCA GCT 1939 
Ile Ser Asn Ala Tyr Val Asp Leu Ala Pro Thr Ser Gly Thr Ser Ala
600 605 610 

AAG ACA CCC TCT GAG GGC GGG AAG CTT CAG AGT ACC TTT GTA TTT GAG 1987 
Lys Thr Pro Ser Glu Gly Gly Lys Leu Gln Ser Thr Phe Val Phe Glu
615 620 625 

GAA ATA GGA CGC CGC CTA AAG GAT ATT GGG CCT GAG GTG GTG AAG AAA 2035 
Glu Ile Gly Arg Arg Leu Lys Asp Ile Gly Pro Glu Val Val Lys Lys
630 635 640 

GTA AAT GCT GTA TTT GAG TGG CAT ATA ACC AAA GGC GGA AAT ATT GGG 2083 
Val Asn Ala Val Phe Glu Trp His Ile Thr Lys Gly Gly Asn Ile Gly
645 650 655 660 

GCT AAG TGG ACT ATT GAC CTG AAA AGT GGT TCT GGA AAA GTG TAC CAA 2131 
Ala Lys Trp Thr Ile Asp Leu Lys Ser Gly Ser Gly Lys Val Tyr Gln
665 670 675

GGC CCT GCA AAA GGT GCT GCT GAT ACA ACA ATC ATA CTT TCA GAT GAA 2179 
Gly Pro Ala Lys Gly Ala Ala Asp Thr Thr Ile Ile Leu Ser Asp Glu
680 685 690 

GAT TTC ATG GAG GTG GTC CTG GGC AAG CTT GAC CCT CAG AAG GCA TTC 2227 
Asp Phe Met Glu Val Val Leu Gly Lys Leu Asp Pro Gln Lys Ala Phe
695 700 705 

TTT AGT GGC AGG CTG AAG GCC AGA GGG AAC ATC ATG CTG AGC CAG AAA 2275

Phe Ser Gly Arg Leu Lys Ala Arg Gly Asn Ile Met Leu Ser Gln Lys
710 715 720

CTT CAG ATG ATT CTT AAA GAC TAC GCC AAG CTC TGA AGGGCACACT 2321
Leu Gln Met Ile Leu Lys Asp Tyr Ala Lys Leu *
725 730 735 



ACACTATTAA TAAAAATGGA ATCATTAAAT ACTCTCTTCA CCCAAATATG CTTGATTATT 2381

CTGCAAAAGT GATTAGAACT AAGATGCAGG GGAAATTGCT TAACATTTTC AGATATCAGA 2441

TAACTGCAGA TTTTCATTTT CTACTAATTT TTCATGTATC ATTATTTTTA CAAGGAACTA 2501

TATATAAGCT AGCACATAAT TATCCTTCTG TTCTTAGATC TGTATCTTCA TAATAAAAAA 2561

ATTTTGCCCA AGTCCTGTTT CCTTAGAATT TGTGATAGCA TTGATAAGTT GAAAGGAAAA 2621

TTAAATCAAT AAAGGCCTTT GATACCTTTA AAAAAAAAAA AAAAAAAAAA AAAA 2675



(2) INFORMATION FOR SEQ ID NO: 20: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 736 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20: 

Met Gly Ser Pro Leu Arg Phe Asp Gly Arg Val Val Leu Val Thr Gly 
1 5 10 15 

Ala Gly Ala Gly Leu Gly Arg Ala Tyr Ala Leu Ala Phe Ala Glu Arg 
20 25 30 

Gly Ala Leu Val Val Val Asn Asp Leu Gly Gly Asp Phe Lys Gly Val 
35 40 45 

Gly Lys Gly Ser Leu Ala Asp Lys Val Val Glu Glu Ile Arg Arg Arg
50 55 60 

Gly Gly Lys Ala Val Ala Asn Tyr Asp Ser Val Glu Glu Gly Glu Lys 
65 70 75 80

Val Val Lys Thr Ala Leu Asp Ala Phe Gly Arg Ile Asp Val Val Val
85 90 95 

Asn Asn Ala Gly Ile Leu Arg Asp His Ser Phe Ala Arg Ile Ser Asp
100 105 110 

Glu Asp Trp Asp Ile Ile His Arg Val His Leu Arg Gly Ser Phe Gln
115 120 125 

Val Thr Arg Ala Ala Trp Glu His Met Lys Lys Gln Lys Tyr Gly Arg
130 135 140

Ile Ile Met Thr Ser Ser Ala Ser Gly Ile Tyr Gly Asn Phe Gly Gln
145 150 155 160

Ala Asn Tyr Ser Ala Ala Lys Leu Gly Leu Leu Gly Leu Ala Asn Ser 
165 170 175 

Leu Ala Ile Glu Gly Arg Lys Ser Asn Ile His Cys Asn Thr Ile Ala
180 185 190

Pro Asn Ala Gly Ser Arg Met Thr Gln Thr Val Met Pro Glu Asp Leu
195 200 205 

Val Glu Ala Leu Lys Pro Glu Tyr Val Ala Pro Leu Val Leu Trp Leu
210 215 220 

Cys His Glu Ser Cys Glu Glu Asn Gly Gly Leu Phe Glu Val Gly Ala 
225 230 235 240 

Gly Trp Ile Gly Lys Leu Arg Trp Glu Arg Thr Leu Gly Ala Ile Val
245 250 255 

Arg Gln Lys Asn His Pro Met Thr Pro Glu Ala Val Lys Ala Asn Trp
260 265 270 

Lys Lys Ile Cys Asp Phe Glu Asn Ala Ser Lys Pro Gln Ser Ile Gln
275 280 285 

Glu Ser Thr Gly Ser Ile Ile Glu Val Leu Ser Lys Ile Asp Ser Glu
290 295 300 

Gly Gly Val Ser Ala Asn His Thr Ser Arg Ala Thr Ser Thr Ala Thr 
305 310 315 320 

Ser Gly Phe Ala Gly Ala Ile Gly Gln Lys Leu Pro Pro Phe Ser Tyr
325 330 335

Ala Tyr Thr Glu Leu Glu Ala Ile Met Tyr Ala Leu Gly Val Gly Ala
340 345 350 

Ser Ile Lys Asp Pro Lys Asp Leu Lys Phe Ile Tyr Glu Gly Ser Ser
355 360 365 

Asp Phe Ser Cys Leu Pro Thr Phe Gly Val Ile Ile Gly Gln Lys Ser
370 375 380 

Met Met Gly Gly Gly Leu Ala Glu Ile Pro Gly Leu Ser Ile Asn Phe
385 390 395 400 

Ala Lys Val Leu His Gly Glu Gln Tyr Leu Glu Leu Tyr Lys Pro Leu
405 410 415 

Pro Arg Ala Gly Lys Leu Lys Cys Glu Ala Val Val Ala Asp Val Leu 
420 425 430 

Asp Lys Gly Ser Gly Val Val Ile Ile Met Asp Val Tyr Ser Tyr Ser
435 440 445 

Glu Lys Glu Leu Ile Cys His Asn Gln Phe Ser Leu Phe Leu Val Gly
450 455 460 

Ser Gly Gly Phe Gly Gly Lys Arg Thr Ser Asp Lys Val Lys Val Ala 
465 470 475 480

Val Ala Ile Pro Asn Arg Pro Pro Asp Ala Val Leu Thr Asp Thr Thr
485 490 495 

Ser Leu Asn Gln Ala Ala Leu Tyr Arg Leu Ser Gly Asp Arg Asn Pro
500 505 510 

Leu His Ile Asp Pro Asn Phe Ala Ser Leu Ala Gly Phe Asp Lys Pro
515 520 525 

Ile Leu His Gly Leu Cys Thr Phe Gly Phe Ser Ala Arg Arg Val Leu
530 535 540 

Gln Gln Phe Ala Asp Asn Asp Val Ser Arg Phe Lys Ala Val Lys Ala
545 550 555 560

Arg Phe Ala Lys Pro Val Tyr Pro Gly Gln Thr Leu Gln Thr Glu Met
565 570 575

Trp Lys Glu Gly Asn Arg Ile His Phe Gln Thr Lys Val Gln Glu Thr
580 585 590 

Gly Asp Ile Val Ile Ser Asn Ala Tyr Val Asp Leu Ala Pro Thr Ser
595 600 605 

Gly Thr Ser Ala Lys Thr Pro Ser Glu Gly Gly Lys Leu Gln Ser Thr
610 615 620 

Phe Val Phe Glu Glu Ile Gly Arg Arg Leu Lys Asp Ile Gly Pro Glu
625 630 635 640 

Val Val Lys Lys Val Asn Ala Val Phe Glu Trp His Ile Thr Lys Gly
645 650 655 

Gly Asn Ile Gly Ala Lys Trp Thr Ile Asp Leu Lys Ser Gly Ser Gly
660 665 670 

Lys Val Tyr Gln Gly Pro Ala Lys Gly Ala Ala Asp Thr Thr Ile Ile
675 680 685 

Leu Ser Asp Glu Asp Phe Met Glu Val Val Leu Gly Lys Leu Asp Pro
690 695 700 

Gln Lys Ala Phe Phe Ser Gly Arg Leu Lys Ala Arg Gly Asn Ile Met
705 710 715 720 

Leu Ser Gln Lys Leu Gln Met Ile Leu Lys Asp Tyr Ala Lys Leu *
725 730 735 